Smart Paper Notes

Parsing with Multilingual BERT, a Small Corpus, and a Small Treebank

Ethan C. Chau , Lucy H. Lin , Noah A. Smith

Category: Natural Language Processing

2020-09-29

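Pretrained multilingual contextual representations have shown great success, but because of the limits of their pretraining data, their benefits do not apply equally to all language varieties. This poses a challenge for language varieties that are unfamiliar to these models and whose labeled and unlabeled data are too limited to train a monolingual model effectively. We propose using additional language-specific pretraining and vocabulary augmentation to adapt multilingual models to low-resource settings. Using dependency parsing of four diverse low-resource language varieties as a case study, we show that these methods significantly improve performance over baselines, especially in the lowest-resource cases, and demonstrate the importance of the relationship between such models' pretraining data and the target language varieties.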

Translated by Google Translate

Examining Political Rhetoric with Epistemic Stance Detection

Ankita Gupta , Su Lin Blodgett , Justin H Gross , Brendan O'Connor

Category: Natural Language Processing

2022-12-29

Participants in political discourse employ rhetorical strategies -- such as hedging, attributions, or denials -- to display varying degrees of belief commitments to claims proposed by themselves or others. Traditionally, political scientists have studied these epistemic phenomena through labor-intensive manual content analysis. We propose to help automate such work through epistemic stance prediction, drawn from research in computational semantics, to distinguish at the clausal level what is asserted, denied, or only ambivalently suggested by the author or other mentioned entities (belief holders). We first develop a simple RoBERTa-based model for multi-source stance predictions that outperforms more complex state-of-the-art modeling. Then we demonstrate its novel application to political science by conducting a large-scale analysis of the Mass Market Manifestos corpus of U.S. political opinion books, where we characterize trends in cited belief holders -- respected allies and opposed bogeymen -- across U.S. political ideologies.

Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning

Jishnu Mukhoti , Tsung-Yu Lin , Omid Poursaeed , Rui Wang , Ashish Shah , Philip H. S. Torr , Ser-Nam Lim

Category: Computer Vision

2022-12-09

We introduce Patch Aligned Contrastive Learning (PACL), a modified compatibility function for CLIP's contrastive loss, intending to train an alignment between the patch tokens of the vision encoder and the CLS token of the text encoder. With such an alignment, a model can identify regions of an image corresponding to a given text input, and therefore transfer seamlessly to the task of open vocabulary semantic segmentation without requiring any segmentation annotations during training. Using pre-trained CLIP encoders with PACL, we are able to set the state-of-the-art on the task of open vocabulary zero-shot segmentation on 4 different segmentation benchmarks: Pascal VOC, Pascal Context, COCO Stuff and ADE20K. Furthermore, we show that PACL is also applicable to image-level predictions and when used with a CLIP backbone, provides a general improvement in zero-shot classification accuracy compared to CLIP, across a suite of 12 image classification datasets.
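
The idea that patch-to-text alignment transfers to segmentation can be illustrated with a toy sketch. It assumes each image patch and each class name already has an embedding in a shared space, and it labels every patch with the most similar class by cosine similarity. The function names and the 2-D vectors are hypothetical, and this is not PACL's actual compatibility function, which the abstract does not spell out.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def label_patches(patch_embeddings, text_embeddings):
    """Assign each patch the class name whose text embedding is most similar."""
    labels = []
    for patch in patch_embeddings:
        best = max(text_embeddings,
                   key=lambda name: cosine(patch, text_embeddings[name]))
        labels.append(best)
    return labels

# Toy 2-D embeddings (made-up values, not real CLIP outputs).
patches = [[0.9, 0.1], [0.8, 0.3], [0.1, 0.95], [0.2, 0.7]]
texts = {"cat": [1.0, 0.0], "grass": [0.0, 1.0]}
segmentation = label_patches(patches, texts)
```

No segmentation labels are used anywhere in this sketch; the per-patch labels come only from the similarity between patch and text embeddings, which is the property the abstract relies on.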

Single-shot Foothold Selection and Constraint Evaluation for Quadruped Locomotion

D. Belter , J. Bednarek , H. -C. Lin , G. Xin , M. Mistry

Category: Robotics

2022-12-01

In this paper, we propose a method for selecting the optimal footholds for legged systems. The goal of the proposed method is to find the best foothold for the swing leg on a local elevation map. We apply a convolutional neural network to learn the relationship between the local elevation map and the quality of potential footholds. The proposed network evaluates the geometrical characteristics of each cell on the elevation map and checks kinematic constraints and collisions. At execution time, the controller obtains a quality measure for each potential foothold from the neural model. This method makes it possible to evaluate hundreds of potential footholds and check multiple constraints in a single step, which takes 10 ms on a standard computer without a GPGPU. The experiments were carried out on a quadruped robot walking over rough terrain, both in simulation and on real robotic platforms.
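
The selection step can be sketched once per-cell scores are available. The minimal example below assumes a grid of quality scores and a separate boolean feasibility mask, and returns the best feasible cell. Splitting the constraints into their own mask is an assumption made for clarity; in the abstract the network itself accounts for kinematic constraints and collisions. All names and numbers are hypothetical.

```python
def select_foothold(quality_map, feasible_mask):
    """Return (row, col) of the highest-scoring feasible cell, or None."""
    best_cell, best_score = None, float("-inf")
    for r, row in enumerate(quality_map):
        for c, score in enumerate(row):
            # Skip cells ruled out by the (assumed) constraint mask.
            if feasible_mask[r][c] and score > best_score:
                best_cell, best_score = (r, c), score
    return best_cell

# Toy 2x3 elevation-map patch with made-up scores.
quality = [[0.2, 0.9, 0.4],
           [0.7, 0.1, 0.8]]
feasible = [[True, False, True],
            [True, True, True]]
best = select_foothold(quality, feasible)
```

Here the cell with the highest raw score, (0, 1), is infeasible, so the sketch falls back to the best remaining cell.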

Efficient Mirror Detection via Multi-level Heterogeneous Learning

Ruozhen He , Jiaying Lin , Rynson W. H. Lau

Category: Computer Vision

2022-11-28

We present HetNet (Multi-level Heterogeneous Network), a highly efficient mirror detection network. Current mirror detection methods focus more on performance than on efficiency, which limits real-time applications (such as drones). Their lack of efficiency stems from the common design of adopting homogeneous modules at different levels, which ignores the differences between features at different levels. In contrast, HetNet first detects potential mirror regions through low-level understandings (e.g., intensity contrasts) and then combines them with high-level understandings (for instance, contextual discontinuity) to finalize the predictions. To perform accurate yet efficient mirror detection, HetNet follows an effective architecture that obtains specific information at different stages to detect mirrors. We further propose a multi-orientation intensity-based contrasted module (MIC) and a reflection semantic logical module (RSL), both equipped on HetNet, to predict potential mirror regions from low-level understandings and to analyze semantic logic in scenarios from high-level understandings, respectively. Compared to the state-of-the-art method, HetNet runs 664% faster and achieves an average performance gain of 8.9% on MAE, 3.1% on IoU, and 2.0% on F-measure on two mirror detection benchmarks.
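
For readers unfamiliar with the three reported metrics, the sketch below computes MAE, IoU, and F-measure for a flattened prediction map against a binary ground-truth mask. The 0.5 binarization threshold and the weight `beta_sq=0.3` are assumptions (the latter is a common convention in saliency-style evaluation); the abstract does not state which settings the authors used.

```python
def mae(pred, truth):
    """Mean absolute error between predicted scores and the binary mask."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(pred)

def iou(pred, truth, threshold=0.5):
    """Intersection over union after thresholding the prediction."""
    binary = [p >= threshold for p in pred]
    inter = sum(1 for b, t in zip(binary, truth) if b and t)
    union = sum(1 for b, t in zip(binary, truth) if b or t)
    return inter / union if union else 1.0

def f_measure(pred, truth, threshold=0.5, beta_sq=0.3):
    """Weighted harmonic mean of precision and recall (beta_sq is assumed)."""
    binary = [p >= threshold for p in pred]
    tp = sum(1 for b, t in zip(binary, truth) if b and t)
    fp = sum(1 for b, t in zip(binary, truth) if b and not t)
    fn = sum(1 for b, t in zip(binary, truth) if not b and t)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)

# Four pixels with made-up scores; the first two belong to the mirror.
pred = [0.9, 0.8, 0.2, 0.6]
truth = [1, 1, 0, 0]
scores = (mae(pred, truth), iou(pred, truth), f_measure(pred, truth))
```

Lower is better for MAE, while higher is better for IoU and F-measure.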

Raising the Bar on the Evaluation of Out-of-Distribution Detection

Jishnu Mukhoti , Tsung-Yu Lin , Bor-Chun Chen , Ashish Shah , Philip H. S. Torr , Puneet K. Dokania , Ser-Nam Lim

Category: Computer Vision | Machine Learning

2022-09-24

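In image classification, there has been a great deal of development in detecting out-of-distribution (OOD) data. However, most OOD detection methods are evaluated on a standard set of datasets that differ arbitrarily from the training data. There is no clear definition of what makes a "good" OOD dataset. Furthermore, state-of-the-art OOD detection methods already achieve near-perfect results on these standard benchmarks. In this paper, we define two categories of OOD data using the subtly different notions of perceptual/visual and semantic similarity to in-distribution (ID) data. We define Near OOD samples as perceptually similar to but semantically different from ID samples, and Shifted samples as points that are visually different from but semantically similar to ID data. We then propose a GAN-based framework for generating OOD samples from each of these two categories, given an ID dataset. Through extensive experiments on MNIST, CIFAR-10/100, and ImageNet, we show that a) state-of-the-art OOD detection methods that perform exceedingly well on conventional benchmarks are significantly less robust on our proposed benchmark [...] benchmarks, and vice versa, indicating that a separate OOD set might not even be needed to reliably evaluate performance in OOD detection.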

Ontologizing Health Systems Data at Scale: Making Translational Discovery a Reality

Tiffany J. Callahan , Adrianne L. Stefanski , Jordan M. Wyrwa , Chenjie Zeng , Anna Ostropolets , Juan M. Banda , William A. Baumgartner Jr. , Richard D. Boyce , Elena Casiraghi , Ben D. Coleman

Category: Artificial Intelligence

2022-09-10

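Common data models solve many challenges of standardizing electronic health record (EHR) data, but they are unable to integrate the resources needed for deep phenotyping. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide semantically computable representations of biological knowledge and enable the integration of a variety of biomedical data. However, mapping EHR data to OBO Foundry ontologies requires significant manual curation and domain expertise. We introduce a framework for mapping Observational Medical Outcomes Partnership (OMOP) standard vocabularies to OBO Foundry ontologies. Using this framework, we produced mappings for 92,367 conditions, 8,615 drug ingredients, and 10,673 measurement results. Domain experts verified the mapping accuracy, and when examined across 24 hospitals, the mappings covered 99% of conditions and drug ingredients and 68% of measurements. Finally, we demonstrate that OMOP2OBO mappings can help systematically identify undiagnosed rare disease patients who might benefit from genetic testing.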

FOLIO: Natural Language Reasoning with First-Order Logic

Simeng Han , Hailey Schoelkopf , Yilun Zhao , Zhenting Qi , Martin Riddell , Luke Benson , Lucy Sun , Ekaterina Zubova , Yujie Qiao , Matthew Burtell

Category: Natural Language Processing

2022-09-02

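We present FOLIO, a human-annotated, open-domain, and logically complex and diverse dataset for reasoning in natural language (NL), equipped with first-order logic (FOL) annotations. FOLIO consists of 1,435 examples (unique conclusions), each paired with one of 487 sets of premises, which serve as rules for reasoning deductively about the validity of each conclusion. The logical correctness of the premises and conclusions is ensured by their parallel FOL annotations, which are automatically verified by our FOL inference engine. In addition to the main NL reasoning task, the NL-FOL pairs in FOLIO automatically constitute a new NL-FOL translation dataset that uses FOL as the logical form. Our experiments on FOLIO systematically evaluate the FOL reasoning ability of supervised fine-tuning on medium-sized language models (BERT, RoBERTa) and of few-shot prompting on large language models (GPT-NeoX, OPT, GPT-3, Codex). For NL-FOL translation, we experiment with GPT-3 and Codex. Our results show that one of the most capable publicly available large language models (LLMs), GPT-3 davinci, achieves only slightly better than random results with few-shot prompting on a subset of FOLIO, and that the model is especially bad at predicting the correct truth values for False and Unknown conclusions. Our dataset and code are available at https://github.com/yale-lily/folio.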
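
The three-way labeling of conclusions (True, False, Unknown) can be illustrated with a simplified sketch. It uses propositional logic rather than full first-order logic, and brute-force enumeration of truth assignments rather than the authors' FOL inference engine. The function name and the toy premises are hypothetical.

```python
from itertools import product

def truth_value(variables, premises, conclusion):
    """Label a conclusion 'True', 'False', or 'Unknown' given the premises.

    premises and conclusion are functions from an assignment dict to bool.
    Inconsistent premises (no satisfying assignment) yield 'Unknown' here.
    """
    holds_somewhere = fails_somewhere = False
    for values in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, values))
        if all(premise(world) for premise in premises):
            if conclusion(world):
                holds_somewhere = True
            else:
                fails_somewhere = True
    if holds_somewhere and not fails_somewhere:
        return "True"
    if fails_somewhere and not holds_somewhere:
        return "False"
    return "Unknown"

variables = ["penguin", "bird", "flies", "swims"]
premises = [
    lambda w: (not w["penguin"]) or w["bird"],         # penguins are birds
    lambda w: (not w["penguin"]) or (not w["flies"]),  # penguins do not fly
    lambda w: w["penguin"],                            # this animal is a penguin
]
labels = {name: truth_value(variables, premises, lambda w, n=name: w[n])
          for name in ["bird", "flies", "swims"]}
```

A conclusion is Unknown when the premises allow it to be either true or false, as with `swims` in this toy example; the abstract reports that False and Unknown are the hardest labels for GPT-3 davinci.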

Paired Cross-Modal Data Augmentation for Fine-Grained Image-to-Text Retrieval

Hao Wang , Guosheng Lin , Steven C. H. Hoi , Chunyan Miao

Category: Computer Vision

2022-07-29

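This paper investigates the open research problem of generating text-image pairs to improve the training of fine-grained image-to-text cross-modal retrieval, and proposes a novel framework for paired data augmentation that uncovers the hidden semantic information of the StyleGAN2 model. Specifically, we first train a StyleGAN2 model on the given dataset. We then project the real images back into the latent space of StyleGAN2 to obtain their latent codes. To make the generated images manipulable, we further introduce a latent space alignment module that learns the alignment between StyleGAN2 latent codes and the corresponding textual caption features. During online paired data augmentation, we first generate augmented text through random token replacement, then pass the augmented text into the latent space alignment module to output latent codes, which are finally fed to StyleGAN2 to generate the augmented images. We evaluate the efficacy of our augmented data approach on two public cross-modal retrieval datasets, where the promising experimental results demonstrate that the augmented text-image pairs can be trained together with the original data to boost image-to-text cross-modal retrieval performance.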
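
The text side of the augmentation, random token replacement, can be sketched in a few lines. The example assumes whitespace-tokenized captions, a flat replacement vocabulary, and a per-token replacement probability; none of these details come from the abstract, and the names are hypothetical. The latent space alignment module and the StyleGAN2 generator are not modeled here.

```python
import random

def replace_tokens(tokens, vocabulary, rate, rng):
    """Replace each token with a random vocabulary entry with probability `rate`."""
    return [rng.choice(vocabulary) if rng.random() < rate else token
            for token in tokens]

caption = "a small yellow bird with black wings".split()
vocabulary = ["red", "blue", "large", "white", "striped"]
rng = random.Random(0)  # fixed seed so the sketch is reproducible
augmented = replace_tokens(caption, vocabulary, rate=0.3, rng=rng)
```

In the pipeline described above, the augmented caption would then be mapped to a latent code and rendered into a matching image, so that text and image stay paired.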

3D Cartoon Face Generation with Controllable Expressions from a Single GAN Image

Hao Wang , Guosheng Lin , Steven C. H. Hoi , Chunyan Miao

Category: Computer Vision

2022-07-29

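In this paper, we investigate the open research task of generating 3D cartoon face shapes from single 2D GAN-generated human faces without 3D supervision, where we can also manipulate the facial expressions of the 3D shapes. To this end, we discover the semantic meanings of the StyleGAN latent space, so that we are able to produce face images with various expressions, poses, and lighting by controlling the latent codes. Specifically, we first fine-tune the pretrained StyleGAN face model on cartoon datasets. By feeding the same latent codes to the face and cartoon generation models, we aim to translate 2D human face images into cartoon-styled avatars. We then discover semantic directions in the GAN latent space in an attempt to change facial expressions while preserving the original identity. Because we do not have any 3D annotations for cartoon faces, we manipulate the latent codes to generate images with different poses and lighting, so that we can reconstruct the 3D cartoon face shapes. We validate the efficacy of our method on three cartoon datasets, both qualitatively and quantitatively.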
